Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy
نویسندگان
چکیده
This study focuses in the impact of age-related intentional voice modification, or age disguise, on the performance of automatic speaker verification (ASV) systems. The data collected for this study includes 60 native Finnish speakers (29 males, 31 females) with age range between 18 and 73 years. The corpus consist of two sessions of read speech per speaker. Our experiments demonstrate vulnerability of modern ASV systems when a person attempts to conceal his or her identity, by modifying the voice to sound like an old or young person. For our i-vector PLDA system, the increase in equal error rate (EER), in the case of male speakers, was 7-fold for the attempt of old voice and 11-fold for young voice. Similar degradation in performance is observed for female speakers with a 5-fold increase in EER for old voice disguise and a 6-fold increase for young voice disguise. We further analyze the factors affecting the performance of ASV systems for the studied speech data. In our experiments, male speakers were found more successful in disguising their voices. The effect on fundamental frequency (F0) was also studied. The mean F0 distributions showed a shift towards higher frequencies when speakers attempted a young voice, which relates to the perception that younger speakers’ F0 values tend to be higher than for older speakers.
منابع مشابه
Acoustical and perceptual study of voice disguise by age modification in speaker verification
The task of speaker recognition is feasible when the speakers are co-operative or wish to be recognized. While modern automatic speaker verification (ASV) systems and some listeners are good at recognizing speakers from modal, unmodified speech, the task becomes notoriously difficult in situations of deliberate voice disguise when the speaker aims at masking his or her identity. We approach voi...
متن کاملA Switch of Dialect as Disguise
Criminals may purposely try to hide their identity by using a voice disguise such as imitating another dialect. This paper empirically investigates the power of dialect as an attribute that listeners use when identifying voices and how a switch of dialect affects voice identification. In order to delimit the magnitude of the perceptual significance of dialect and the possible impact of dialect ...
متن کاملMFCC VQ based Speaker Recognition and Its Accuracy Affecting Factors
The present study was conducted to evaluate the accuracy affecting factors of a Mel-Frequency Cepstral Coefficients (MFCC) and Vector Quantization (VQ) based speaker recognition system. This investigation analyses the factors that affecting recognition accuracy using speech signal from day to day life in surrounding environments. It was studied the mismatch affects of text-dependency, voice sam...
متن کاملEffect of voice disguise on the performance of a forensic automatic speaker recognition system
This paper presents first results of an ongoing study on the effects of common types of voice disguise, including increased voice pitch (even falsetto speech), lowered voice pitch and pinching the nose while speaking, on forensic speaker recognition (FSR) techniques. Natural and disguised speech data from 100 German speakers recorded 5 times over a period of 7 to 9 months were used in a series ...
متن کاملVocal Age Disguise: The Role of Fundamental Frequency and Speech Rate and Its Perceived Effects
The relationship between vocal characteristics and perceived age is of interest in various contexts, as is the possibility to affect age perception through vocal manipulation. A few examples of such situations are when age is staged by actors, when ear witnesses make age assessments based on vocal cues only or when offenders (e.g., online groomers) disguise their voice to appear younger or olde...
متن کامل